1. Introduction

The  Polymorphous  Computing  Architectures  (PCA)  program  is  a 
Defense  Advanced  Research  Projects  Agency  (DARPA)  effort  to 
develop  new  embedded  computing  platforms  with  very  strong, 
rapid  in-mission  reconfigurability.  Target  applications  range 
from  military  platforms  that  must  adapt  to  rapidly  changing 
mission  parameters,  to  embedded  network  controllers  whose 
optimal  configuration  of  hardware  resources  will  change  in 
response  to  the  traffic  and  environmental  conditions  they  face. 

The  PCA  program  “core  projects”  working  to  develop 
microprocessors  that  implement  polymorphous  capabilities 
include  Smart  Memories,  Raw,  TRIPS,  and  MONARCH; 
references  for  all  are  available  in  [1].  The  chips  under 
development  in  these  projects  have  several  characteristics  in 
common.  These  are  typically  tiled  structures  composed  from 
replicated,  fully  capable  computing  cores,  reconfigurable  memory 
and  cache  elements,  and  a  rich  set  of  reconfigurable  data  paths, 
network  interfaces,  and  I/O  paths.  Each  can  operate  in  streaming 
or  threaded  modes.  Each  has  mechanisms  for  aggregating 
individual  processor  tiles  into  larger  compound  processor  units. 
They  differ  in  their  approach  for  aggregating  processors  and  in 
their  emphasis  on  processor,  memory,  or  communication  design. 


Figure 1 illustrates a generic PCA microarchitecture.

Figure 1. Generic PCA chip microarchitecture.

This ability to aggregate varying numbers and types of elements
on a PCA chip means that the chip can be effectively partitioned
into multiple processors of similar or different types, with each
partition assigned to a different portion of the application program
or even to different programs. For instance, one portion of a PCA
device could be optimized for stream processing and dedicated to
a sensor processing dataflow computation, while another is
configured for conventional thread processing and allocated to
conventional control processing. Furthermore, the number of chip
tiles dedicated to each processor type could be varied based on
expected loads.

To exploit the capabilities of PCA hardware while retaining as
much end-user portability and performance as possible, the
Morphware Forum (www.morphware.org), an informal
consortium of the PCA contractors and other selected participants,
is creating an application development framework called the
Morphware Stable Interface (MSI). An overview of the MSI is
available in [1].

A key capability envisioned for PCA systems is morphing, the
reconfiguration and re-allocation of PCA hardware resources
within a chip in response to various events. Morphing is
fundamentally enabled by the reconfigurable hardware
microarchitecture of PCA chips, but is made accessible to the
programming environment through the MSI. Thus, a major
software design issue for the PCA program is how to structure the
MSI so as to support morphing while maintaining portability
across the various PCA targets.

2. Types of Morphing

The MSI envisions a component-based application software
architecture. Components provide natural and intuitive
boundaries for run-time reconfiguration of hardware. In general,
multiple implementations of various units of functionality (e.g., a
fast Fourier transform) will exist as different components, each
offering different trade-offs between performance and system
requirements, and capable of being compiled to differing amounts
of hardware resources. Morphing then implies changing either the
particular component implementations in use, the resources
assigned to the components, or both. Different types of morphing
can thus be classified based on three orthogonal characteristics:

•  whether the application code directly makes an
application programming interface (API) call to initiate
morphing, or morphing is done invisibly to the user by
either the run-time system or the compiler;

•  whether the component continues to execute or is
reloaded or replaced with an alternate component; and

•  whether the resources allocated to the application must
change or stay the same.

The  Morphware  Forum  has  developed  a  taxonomy  of  morph 
types,  summarized  in  Table  1,  to  describe  the  various  situations. 
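
The three orthogonal characteristics listed above can be treated as independent yes/no axes. The following is a minimal sketch of that idea only; the class and field names are hypothetical and are not taken from Table 1, and not every combination need correspond to a morph type the Forum actually names.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MorphKind:
    """One point in the three-axis classification (all names are hypothetical)."""
    api_initiated: bool       # True: application code makes an API call;
                              # False: run-time system or compiler acts invisibly
    component_replaced: bool  # True: component is reloaded or replaced;
                              # False: component continues to execute
    resources_change: bool    # True: allocated resources change;
                              # False: resources stay the same


def all_morph_kinds():
    """Enumerate every combination of the three independent axes."""
    return [MorphKind(*flags) for flags in product((False, True), repeat=3)]


kinds = all_morph_kinds()
```

Because the axes are independent, the enumeration yields two choices per axis, or eight candidate combinations in total.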

Table 1. PCA Morphing Taxonomy

3. Morphing in the MSI

3.1  Compilation  Architecture 

Portability  across  alternative  PCA  target  devices  is  obtained  in  the 
MSI  by  using  a  two-level  compilation  architecture,  as  shown  in 
Figure  2.  The  application  program  is  partitioned  by  the  user  or  by 
tools  yet  to  be  developed  into  streaming  units  and  non-streaming 
(conventional)  units.  The  former  are  expressed  in  a  specialized 
streaming language such as Brook or StreamIt, while the latter are
in  conventional  C  or  C++  code.  The  high  level  compiler  (HLC) 
takes  in  this  user  source  code  as  well  as  a  machine  model  (MM),  a 
metadata  description  of  the  resources  available  on  the  PCA 
devices  to  the  compilation  unit  and  their  configuration. 

Figure 2. MSI Compilation Architecture.

The  HLC  compiles  the  streaming  input  units  to  a  stream  virtual 
machine  (SVM)  description.  The  HLC  utilizes  the  information  in 
the  particular  MM  provided  with  the  source  code  to  optimize  the 
coarse-grain  parallelization  of  the  streaming  program  unit.  Thus, 
the  same  application  code  will  produce  different  SVM  codes, 
depending  on  the  machine  model  description  of  the  available 
resources.  This  mechanism  provides  the  basic  capability  for 
portability  across  multiple  target  machines,  as  well  as  the 
capability  to  vary  the  amount  of  resources  assigned  to  a  functional 
unit  within  the  same  machine. 
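
As a rough illustration of how one source program can yield different coarse-grain mappings under different machine models, consider the sketch below. It is an assumption-laden toy: `MachineModel` carries only a hypothetical tile count, the kernel names are invented, and round-robin placement stands in for whatever optimization a real HLC would perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineModel:
    """Hypothetical stand-in for an MM: just a name and a tile count."""
    name: str
    tiles: int


def map_kernels(kernels, mm):
    """Assign stream kernels to tiles round-robin; returns {tile: [kernels]}."""
    if mm.tiles < 1:
        raise ValueError("machine model must expose at least one tile")
    mapping = {tile: [] for tile in range(mm.tiles)}
    for i, kernel in enumerate(kernels):
        mapping[i % mm.tiles].append(kernel)
    return mapping


# The same (invented) streaming program, compiled against two machine models.
kernels = ["filter", "fft", "detect", "track"]
small = map_kernels(kernels, MachineModel("small", tiles=2))
large = map_kernels(kernels, MachineModel("large", tiles=4))
```

With two tiles each tile hosts two kernels; with four tiles each kernel gets its own tile. The source list of kernels is unchanged in both cases.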


Threaded  code  is  expressed  in  terms  of  a  thread  virtual  machine 
(TVM),  in  turn  composed  of  a  user-level  VM  (UVM)  and  a 
hardware  architecture  layer  (HAL).  Other  than  expressing  the 
output  in  these  machine-neutral  APIs,  the  HLC  largely  passes 
threaded code to the output without optimization. The
machine-specific low level compilers (LLCs) then compile the VM
code for their particular target PCA machine, performing further
fine-grained parallelization and optimizations as appropriate.
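
The two-level flow described in this section can be sketched as two stages. This is only a schematic under stated assumptions: the function names, the unit names, and the string "binaries" are hypothetical, and the machine model input to the HLC is omitted for brevity.

```python
def high_level_compile(units):
    """Route each (name, kind) source unit to the VM that expresses it."""
    vm_for_kind = {"streaming": "SVM", "threaded": "TVM"}
    return [(name, vm_for_kind[kind]) for name, kind in units]


def low_level_compile(vm_units, target):
    """Stand-in for a target-specific LLC: tag each VM unit with its target."""
    return [f"{name}:{vm}->{target}" for name, vm in vm_units]


# Hypothetical application with one streaming unit and one threaded unit.
units = [("beamform", "streaming"), ("control", "threaded")]
vm_units = high_level_compile(units)

# Each target's LLC consumes the machine-neutral VM-level output.
raw_output = low_level_compile(vm_units, "Raw")
trips_output = low_level_compile(vm_units, "TRIPS")
```

The point of the split is that only the second stage needs to know the specific target chip.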

3.2  Morphing  Mechanisms 

As  seen  in  Table  1,  some  morphing  operations  are  defined  by  the 
compiler,  while  others  are  controlled  by  the  run-time  system, 
most likely in the form of a yet-to-be-defined resource manager.
Thus,  morphing  is  actually  implemented  by  various  levels  of  the 
MSI,  depending  on  the  type  of  morph.  Compiler-directed  morphs 
can  be  the  result  of  changes  in  the  machine  model  for  the  target 
hardware,  of  coarse-grain  optimization  decisions  made  by  the 
HLC, or of fine-grain configuration decisions made by the LLC.
Morphs  that  change  the  executing  components  must  be  initiated 
and  controlled  by  the  run-time  system  of  the  PCA  machine. 

Several methods for representing the various forms of morphing
have been proposed, including modeling morphs as program
branches, explicit control of variables representing machine state
at the Stable API (SAPI) or Stable Architecture Abstraction Layer
(SAAL) levels, and marking sections of code with performance and
resource constraint expressions. The candidate approaches to date
will be described and compared, considering such issues as the
level of the MSI at which they are implemented, their granularity,
and their visibility to and controllability by the programmer.
Selecting and vetting the appropriate interfaces to represent
morphing is currently a primary focus of the Morphware Forum.
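
To make the constraint-expression idea concrete, the sketch below picks one of several component implementations given a resource budget and a performance bound. It is not an MSI interface: the `Implementation` record, the variant names, and the tile and latency figures are all hypothetical, and the selection rule (fewest tiles among feasible candidates) is one arbitrary policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    """Hypothetical component variant with illustrative cost figures."""
    name: str
    tiles_needed: int
    latency_ms: float


def select_implementation(candidates, tiles_available, max_latency_ms):
    """Return the feasible variant using the fewest tiles, or None."""
    feasible = [
        c for c in candidates
        if c.tiles_needed <= tiles_available and c.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (c.tiles_needed, c.latency_ms))


# Invented variants of a fast Fourier transform component.
fft_variants = [
    Implementation("fft_serial", tiles_needed=1, latency_ms=8.0),
    Implementation("fft_4tile", tiles_needed=4, latency_ms=2.5),
    Implementation("fft_16tile", tiles_needed=16, latency_ms=0.9),
]

chosen = select_implementation(fft_variants, tiles_available=8, max_latency_ms=3.0)
```

Re-running the selection when the available tile count or the latency bound changes is one way a morph that swaps component implementations could be triggered.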

Morphing in PCA Architectures


■  The  DARPA  Polymorphous  Computing  Architectures 
(PCA)  program  is  developing  embedded  high- 
performance  computing  platforms  with  strong,  rapid 
reconfigurability 

■  PCA  processors  are  essentially 
“multiprocessors  on  a  chip” 

□  tiled  architectures 

□  reconfigurable  processing 
aggregates 

□  reconfigurable  networks 

■  “Morphing”  is  the  reconfiguration  and  re-allocation  of 
PCA  hardware  resources  within  a  chip  in  response  to 
various  events 

□  key capability to achieve PCA goals

□  portability  across  PCA  chips  must  be  maintained 

PCA Two-Level Module Compilation Architecture


■  Two-level compile + customizable machine models
enable targeting of same functionality to multiple
machine configurations

Development Process

□  Two-stage  compile  process  enables  portable 
performance  across  PCA  architectures 


Morphware  Stable  Interface  Architecture 

SAPI and SAAL

□  Two  intermediate  representations 

■  Stable  API:  application  code  in  C/C++  and  a  stream 
language such as Brook or StreamIt

■  Stable  Architecture  Abstraction  Layer:  PCA  virtual 
machine  code 


Machine  Models 

Used  to  optimize  VM 
output  for  different 
target  platforms 

■  Coarse  grain  mapping  of 
application  to  target 
resources 

The Morphware Stable Interface

□  Standard  PCA  Application  Environment 

■  Defined  by  a  set  of  open  standards  documents 

□  Based  on  a  virtual  machine  (VM)  abstraction 
layer  with  standardized  metadata  and 
programming  languages 

□  Goals 

■  Foster  software  portability  across  PCA  architectures 

■  Dynamically  optimize  PCA  resources  for  application 
functionality,  service  requirements,  and  constraints 

■  Obtain  nearly  optimal  performance  from  PCA  hardware 

■  Be  highly  reactive  to  PCA  hardware  and  user  inputs 

■  Manage  PCA  software  complexity 

■  Leverage  existing  and  developing  technologies 

□  Cross-project  effort,  developed  in  parallel  with 
the  hardware 